Smart Paper Notes

Supervised Anomaly Detection Method Combining Generative Adversarial Networks and Three-Dimensional Data in Vehicle Inspections

Yohei Baba , Takuro Hoshi , Ryosuke Mori , Gaurang Gavai

Categories: Computer Vision | Machine Learning

2022-12-22

External inspections of rolling stock's underfloor equipment are currently performed visually by human inspectors. In this study, we attempt to partly automate visual inspection by investigating anomaly inspection algorithms that use image processing technology. As railroad maintenance studies tend to have little anomaly data, unsupervised learning methods are usually preferred for anomaly detection; however, training cost and accuracy are still challenges. Additionally, an earlier study created anomalous images from normal images by adding noise, etc., but the anomaly targeted in this study is the rotation of piping cocks, which is difficult to create using noise. Therefore, in this study, we propose a new method that applies style conversion via generative adversarial networks to three-dimensional computer graphics to imitate anomaly images, so that supervised anomaly detection can be applied. A geometry-consistent style conversion model was used to convert the images, so the color and texture were successfully made to imitate the real images while the anomalous shape was maintained. Using the generated anomaly images as supervised data, the anomaly detection model can be easily trained without complex adjustments and successfully detects anomalies.

Translated by Google Translate

Influence of collaborative customer service by service robots and clerks in bakery stores

Yuki Okafuji , Sichao Song , Jun Baba , Yuichiro Yoshikawa , Hiroshi Ishiguro

Categories: Robotics

2022-12-20

In recent years, various service robots have been introduced in stores as recommendation systems. Previous studies attempted to increase the influence of these robots by improving their social acceptance and trust. However, when such service robots recommend a product to customers in real environments, the effect on the customers is influenced not only by the robot itself, but also by the social influence of the surrounding people such as store clerks. Therefore, leveraging the social influence of the clerks may increase the influence of the robots on the customers. Hence, we compared the influence of robots with and without collaborative customer service between the robots and clerks in two bakery stores. The experimental results showed that collaborative customer service increased the purchase rate of the recommended bread and improved the impression regarding the robot and store experience of the customers. Because the results also showed that the workload required for the clerks to collaborate with the robot was not high, this study suggests that all stores with service robots may show high effectiveness in introducing collaborative customer service.

Composition, Attention, or Both?

Ryo Yoshida , Yohei Oseki

Categories: Natural Language Processing

2022-10-24

In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively composes subtrees into a single vector representation with a composition function, and selectively attends to previous structural information with a self-attention mechanism. We investigate whether these components -- the composition function and the self-attention mechanism -- can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components, with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.

Service Robots in a Bakery Shop: A Field Study

Sichao Song , Baba Jun , Junya Nakanishi , Yuichiro Yoshikawa , Hiroshi Ishiguro

Categories: Robotics

2022-08-19

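In this paper, we report on a field study in which we employed two service robots in a bakery store as a sales promotion. Previous studies have explored applications of service robots in public spaces such as shopping malls. However, more evidence is needed that service robots can contribute to sales in real stores. Moreover, the behaviors of customers and service robots in the context of sales promotions have not been examined well. Hence, the types of robot behavior that can be considered effective, and the customers' responses to these robots, remain unclear. To address these issues, we installed two tele-operated service robots in a bakery store for nearly 2 weeks, one at the entrance as a greeter and the other inside the store to recommend products. The results show a dramatic increase in sales on the days when the robots were deployed. Furthermore, we annotated the video recordings of both the robots' and the customers' behavior. We found that although the robot placed at the entrance successfully attracted the interest of passersby, no apparent increase in the number of customers visiting the store was observed. However, we confirmed that the recommendations of the robot operating inside the store did have a positive impact. We discuss our findings in detail and provide both theoretical and practical recommendations for future research and applications.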

Description and Discussion on DCASE 2022 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques

Kota Dohi , Keisuke Imoto , Noboru Harada , Daisuke Niizumi , Yuma Koizumi , Tomoya Nishida , Harsh Purohit , Takashi Endo , Masaaki Yamamoto , Yohei Kawaguchi

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-13

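We present the task description of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2022 Challenge Task 2: "Unsupervised anomalous sound detection (ASD) for machine condition monitoring applying domain generalization techniques". Domain shifts are a critical problem for the application of ASD systems. Because domain shifts can change the acoustic characteristics of data, a model trained in a source domain performs poorly in a target domain. In DCASE 2021 Challenge Task 2, we organized an ASD task for handling domain shifts. In that task, it was assumed that the occurrences of domain shifts are known. In practice, however, the domain of each sample may not be given, and domain shifts can occur implicitly. In 2022 Task 2, we focus on domain generalization techniques that detect anomalies regardless of domain shifts. Specifically, the domain of each sample is not given in the test data, and only one threshold is allowed for all domains. We will add the challenge results and an analysis of the submissions after the challenge submission deadline.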
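
The single-threshold constraint described above can be illustrated with a minimal sketch. The quantile rule for choosing the threshold, the function names, and the score values are all assumptions made for illustration; the task description only states that one threshold must serve every domain and that test clips carry no domain label.

```python
def fit_single_threshold(normal_scores, quantile=0.9):
    """Pick one threshold from anomaly scores of normal training clips.

    Hypothetical rule: a simple rank-based quantile over the pooled scores.
    """
    if not normal_scores:
        raise ValueError("need at least one score")
    ordered = sorted(normal_scores)
    # Index of the score sitting at the requested fraction of the sorted list.
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def detect(scores, threshold):
    """Flag a clip as anomalous when its score exceeds the shared threshold."""
    return [score > threshold for score in scores]


# Made-up anomaly scores of normal clips from two domains.
source_normal = [0.10, 0.12, 0.15, 0.11, 0.14]
target_normal = [0.30, 0.35, 0.32]

# One threshold is fitted on the pooled scores and used for every domain.
threshold = fit_single_threshold(source_normal + target_normal, quantile=0.9)

# Test clips arrive without a domain label, so the same threshold applies to all.
test_scores = [0.13, 0.33, 0.80, 0.95]
flags = detect(test_scores, threshold)
```

Pooling the scores lets the domain with the higher normal scores set the threshold, which is one reason a detector whose scores shift little between domains is preferable under this rule.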

Hierarchical Conditional Variational Autoencoder Based Acoustic Anomaly Detection

Harsh Purohit , Takashi Endo , Masaaki Yamamoto , Yohei Kawaguchi

Categories: Machine Learning | Artificial Intelligence

2022-06-11

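This paper aims to develop an unsupervised anomaly detection method based on acoustic signals for automatic machine monitoring. Existing approaches such as the deep autoencoder (DAE), variational autoencoder (VAE), and conditional variational autoencoder (CVAE) have limited representation capabilities in the latent space and, hence, poor anomaly detection performance. A different model has to be trained for each type of machine to perform the anomaly detection task accurately. To solve this issue, we propose a new method called the hierarchical conditional variational autoencoder (HCVAE). This method uses available taxonomic hierarchical knowledge about industrial facilities to refine the latent space representation. This knowledge also helps the model improve its anomaly detection performance. We demonstrate the generalization capability of a single HCVAE model across different types of machines by using appropriate conditions. Additionally, to show the practicability of the proposed approach, (i) we evaluated the HCVAE model on different domains and (ii) we examined the effect of partial hierarchical knowledge. Our results show that the HCVAE method validates both of these points and outperforms the baseline system on the anomaly detection task by up to 15% on the AUC score metric.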
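
For readers unfamiliar with the AUC metric cited above, here is a minimal sketch of how it can be computed from anomaly scores. It uses the standard rank interpretation of AUC (the probability that an anomalous sample scores higher than a normal one); the function name and the scores are made up for illustration and are not results from the paper.

```python
def roc_auc(normal_scores, anomalous_scores):
    """Area under the ROC curve via pairwise comparison of anomaly scores."""
    if not normal_scores or not anomalous_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for anomalous in anomalous_scores:
        for normal in normal_scores:
            if anomalous > normal:
                wins += 1.0   # correctly ranked pair
            elif anomalous == normal:
                wins += 0.5   # ties count as half
    return wins / (len(normal_scores) * len(anomalous_scores))


# Hypothetical scores: higher means "more anomalous".
auc = roc_auc(normal_scores=[0.1, 0.4, 0.35], anomalous_scores=[0.8, 0.3])
```

Because AUC depends only on the ranking of scores, it does not require choosing a detection threshold.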

Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors

Shota Horiguchi , Shinji Watanabe , Paola Garcia , Yuki Takashima , Yohei Kawaguchi

Categories: Natural Language Processing

2022-06-06

A method to perform offline and online speaker diarization for an unlimited number of speakers is described in this paper. End-to-end neural diarization (EEND) has achieved overlap-aware speaker diarization by formulating it as a multi-label classification problem. It has also been extended for a flexible number of speakers by introducing speaker-wise attractors. However, the output number of speakers of attractor-based EEND is empirically capped; it cannot deal with cases where the number of speakers appearing during inference is higher than that during training because its speaker counting is trained in a fully supervised manner. Our method, EEND-GLA, solves this problem by introducing unsupervised clustering into attractor-based EEND. In the method, the input audio is first divided into short blocks, then attractor-based diarization is performed for each block, and finally, the results of each block are clustered on the basis of the similarity between locally-calculated attractors. While the number of output speakers is limited within each block, the total number of speakers estimated for the entire input can be higher than the limitation. To use EEND-GLA in an online manner, our method also extends the speaker-tracing buffer, which was originally proposed to enable online inference of conventional EEND. We introduce a block-wise buffer update to make the speaker-tracing buffer compatible with EEND-GLA. Finally, to improve online diarization, our method improves the buffer update method and revisits the variable chunk-size training of EEND. The experimental results demonstrate that EEND-GLA can perform speaker diarization of an unseen number of speakers in both offline and online inferences.
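
The block-wise idea above, where local attractors are clustered so that the global speaker count can exceed the per-block limit, can be sketched as follows. This is not the authors' implementation: the greedy cosine-similarity assignment, the similarity threshold, and all names are assumptions that stand in for the unsupervised clustering step.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_global_speakers(blocks, threshold=0.8):
    """Map each block's local attractors to global speaker indices.

    Attractors from the same block are never merged, since they are
    assumed to represent different speakers.
    """
    representatives = []  # first-seen attractor of each global speaker
    labels = []
    for attractors in blocks:
        used = set()  # global speakers already taken within this block
        block_labels = []
        for vec in attractors:
            best, best_sim = None, threshold
            for idx, rep in enumerate(representatives):
                if idx in used:
                    continue
                sim = cosine(vec, rep)
                if sim >= best_sim:
                    best, best_sim = idx, sim
            if best is None:
                # No sufficiently similar speaker yet: open a new one.
                representatives.append(list(vec))
                best = len(representatives) - 1
            used.add(best)
            block_labels.append(best)
        labels.append(block_labels)
    return labels


# Three blocks, each limited to two local attractors (toy 3-D vectors).
blocks = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.9, 0.1, 0.0), (0.0, 0.0, 1.0)],
    [(0.0, 0.1, 0.9), (0.1, 0.9, 0.0)],
]
labels = assign_global_speakers(blocks)
num_speakers = len({idx for block in labels for idx in block})
```

In this toy input each block contains only two speakers, yet three distinct global speakers are recovered across the recording.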

MTTrans: Cross-Domain Object Detection with Mean-Teacher Transformer

Jinze Yu , Jiaming Liu , Xiaobao Wei , Haoyi Zhou , Yohei Nakata , Denis Gudovskiy , Tomoyuki Okuno , Jianxin Li , Kurt Keutzer , Shanghang Zhang

Categories: Computer Vision

2022-05-03

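Recently, the DEtection TRansformer (DETR), an end-to-end object detection pipeline, has achieved promising performance. However, it requires large-scale labeled data and suffers from domain shift, especially when no labeled data is available in the target domain. To solve this problem, we propose MTTrans, an end-to-end cross-domain detection transformer based on the mean teacher framework, which can fully exploit unlabeled target domain data in object detection training and transfer knowledge between domains via pseudo labels. We further propose comprehensive multi-level feature alignment to improve the pseudo labels generated by the mean teacher framework, taking advantage of the cross-scale self-attention mechanism in Deformable DETR. Image and object features are aligned at the local, global, and instance levels with domain query-based feature alignment (DQFA), bi-level graph-based prototype alignment (BGPA), and token-wise image feature alignment (TIFA). In turn, the unlabeled target domain data, once pseudo-labeled and made available for object detection training by the mean teacher framework, can lead to better feature extraction and alignment. Thus, the mean teacher framework and the comprehensive multi-level feature alignment can be optimized iteratively and mutually based on the transformer architecture. Extensive experiments demonstrate that our proposed method achieves state-of-the-art performance in three domain adaptation scenarios; in particular, the result for the Sim10k to Cityscapes scenario improves from 52.6 mAP to 57.9 mAP. Code will be released.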
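
The mean teacher idea mentioned above can be sketched in a few lines. In the usual mean teacher setup, the teacher's weights are an exponential moving average (EMA) of the student's weights, and confident teacher predictions serve as pseudo labels for the student. The decay value, the score threshold, and every name below are assumptions for illustration; the abstract does not specify them, and this is not the MTTrans code.

```python
def ema_update(teacher, student, decay=0.999):
    """Move each teacher weight toward the matching student weight."""
    return {
        name: [decay * t + (1.0 - decay) * s
               for t, s in zip(teacher[name], values)]
        for name, values in student.items()
    }


def select_pseudo_labels(detections, min_score=0.5):
    """Keep teacher detections confident enough to supervise the student."""
    return [d for d in detections if d["score"] >= min_score]


# Toy weights stored as plain lists keyed by a hypothetical parameter name.
teacher = {"w": [1.0, 0.0]}
student = {"w": [0.0, 1.0]}
teacher = ema_update(teacher, student, decay=0.9)

# Made-up teacher detections on an unlabeled target-domain image.
detections = [
    {"label": "car", "score": 0.92},
    {"label": "car", "score": 0.31},
]
pseudo_labels = select_pseudo_labels(detections, min_score=0.5)
```

A decay close to 1 makes the teacher change slowly, which tends to give more stable pseudo labels than using the student's own predictions.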

Environmental Sound Extraction Using Onomatopoeia

Yuki Okamoto , Shota Horiguchi , Masaaki Yamamoto , Keisuke Imoto , Yohei Kawaguchi

Categories: Machine Learning

2021-12-01

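An onomatopoeic word, which is a character sequence that phonetically imitates a sound, is effective in expressing characteristics of a sound such as its duration, pitch, and timbre. We propose an environmental-sound-extraction method that uses onomatopoeic words to specify the target sound to be extracted. With this method, we estimate a time-frequency mask from an input mixture spectrogram and an onomatopoeic word using a U-Net architecture, then extract the corresponding target sound by masking the spectrogram. Experimental results indicate that the proposed method can extract only the target sound corresponding to the onomatopoeic word and performs better than conventional methods that use sound-event classes to specify the target sound.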
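
The masking step described above amounts to an element-wise product between the mixture spectrogram and the estimated time-frequency mask. The sketch below shows only that step; the mask values are hard-coded stand-ins for what the U-Net would predict, and the function name and array sizes are hypothetical.

```python
def apply_mask(mixture, mask):
    """Multiply a magnitude spectrogram by a time-frequency mask, bin by bin.

    Both arguments are lists of rows (frequency bins) over columns (frames).
    """
    same_shape = len(mixture) == len(mask) and all(
        len(row) == len(mask_row) for row, mask_row in zip(mixture, mask)
    )
    if not same_shape:
        raise ValueError("mixture and mask must have the same shape")
    return [
        [value * weight for value, weight in zip(row, mask_row)]
        for row, mask_row in zip(mixture, mask)
    ]


# Toy 2x2 magnitude spectrogram of the mixture.
mixture = [[2.0, 4.0],
           [1.0, 3.0]]

# Stand-in for the network output: 1 keeps a bin, 0 suppresses it.
mask = [[1.0, 0.0],
        [0.5, 1.0]]

target = apply_mask(mixture, mask)
```

Recovering a waveform would additionally require the mixture phase and an inverse short-time Fourier transform, which are outside this sketch.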
